Combinatorial Methods for Disease Association Search and Susceptibility Prediction
Abstract
The accessibility of high-throughput genotyping technology makes genome-wide association studies possible for common complex diseases. When dealing with common diseases, it is necessary to search for and analyze multiple independent causes resulting from interactions of multiple genes scattered over the entire genome. This becomes computationally challenging, since even interactions between pairs of gene variations require checking more than 10^12 possibilities genome-wide. This paper first explores the problem of searching for the most disease-associated and the most disease-resistant multi-gene interactions for a given population sample of diseased and non-diseased individuals. The proposed fast complementary greedy search finds multi-SNP combinations with non-trivially high association on real data. Building on the methods developed for finding associated risk and resistance factors, the paper then addresses the disease susceptibility prediction problem. We first propose a relevant optimum clustering formulation and a model-fitting algorithm that transforms clustering algorithms into susceptibility prediction algorithms. For three available real data sets (Crohn's disease (Daly et al., 2001), autoimmune disorder (Ueda et al., 2003), and tick-borne encephalitis (Barkash et al., 2006)), the accuracies of prediction based on the combinatorial search (84%, 83%, and 89%, respectively) are 15% higher than those of the best previously known methods. Prediction based on the complementary greedy search almost matches the best accuracy but is much more scalable.
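The abstract does not spell out the greedy search, so the following is only a minimal sketch of one plausible greedy strategy for finding a multi-SNP risk combination, not the authors' algorithm. It assumes genotypes are encoded as small integers (for example 0, 1, 2), and the function name `greedy_risk_combination` and the scoring rule (lose as few cases as possible per control excluded) are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Genotype = List[int]  # one integer value per SNP (assumed encoding, e.g. 0/1/2)


def greedy_risk_combination(
    cases: List[Genotype], controls: List[Genotype]
) -> Tuple[Dict[int, int], List[int], List[int]]:
    """Greedily grow a combination {snp_index: value} that excludes controls
    while keeping as many cases as possible (illustrative heuristic only)."""
    combo: Dict[int, int] = {}
    case_ids = list(range(len(cases)))
    control_ids = list(range(len(controls)))
    n_snps = len(cases[0]) if cases else 0

    # Stop once no control matches the combination (or no case is left).
    while control_ids and case_ids:
        best = None  # (score, snp, value)
        for snp in range(n_snps):
            if snp in combo:
                continue
            # Only values carried by at least one remaining case are useful.
            for value in sorted({cases[i][snp] for i in case_ids}):
                kept_cases = sum(1 for i in case_ids if cases[i][snp] == value)
                kept_controls = sum(
                    1 for j in control_ids if controls[j][snp] == value
                )
                removed_cases = len(case_ids) - kept_cases
                removed_controls = len(control_ids) - kept_controls
                if removed_controls == 0:
                    continue  # this choice makes no progress
                # Lower is better: cases lost per control excluded,
                # ties broken in favour of excluding more controls.
                score = (removed_cases / removed_controls, -removed_controls)
                if best is None or score < best[0]:
                    best = (score, snp, value)
        if best is None:
            break  # no remaining SNP value separates cases from controls
        _, snp, value = best
        combo[snp] = value
        case_ids = [i for i in case_ids if cases[i][snp] == value]
        control_ids = [j for j in control_ids if controls[j][snp] == value]

    return combo, case_ids, control_ids
```

Each pass scans every remaining SNP once, so the cost grows linearly with the number of SNPs per added SNP value, in contrast to exhaustively enumerating all pairs or larger combinations.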
Related resources
Combinatorial Analysis of Disease Association and Susceptibility for Rheumatoid Arthritis SNP Data
In this paper we analyze the SNP data (GAW Problem Set 2) for rheumatoid arthritis (RA) to check whether it is caused by combinations of several unlinked gene variations. We apply improved versions of the combinatorial methods recently reported in [3, 4]. Disease association analysis searches for a SNP with frequency among diseased individuals (cases) considerably higher than among non-disea...
Discrete Algorithms for Analysis Of
The accessibility of high-throughput genotyping technology makes genome-wide association studies possible for common complex diseases. When dealing with common diseases, it is necessary to search for and analyze multiple independent causes resulting from interactions of multiple genes scattered over the entire genome. The optimization formulations for searching disease-associated risk/resistant factors a...
Prediction Methods for Inherited Disease
Recent improvements in the accessibility of high-throughput genotyping have brought a great deal of attention to disease association studies [6]. It is believed that more accurate disease association is achieved with inferred haplotypes than with directly available genotypes. The main goal of disease association analysis is to identify gene variations or, in general, haplotypes which cont...
Association of miR-146a rs2910164 and miR-27a rs895819 polymorphisms with type 2 diabetes susceptibility: A Meta-Analysis
Background and Aim: Several investigations have evaluated the association of the miR-146a rs2910164 and miR-27a rs895819 single-nucleotide polymorphisms with susceptibility to type 2 diabetes (T2D). However, the findings are conflicting and inconclusive. Therefore, this meta-analysis was performed to investigate the association between these polymorphisms and T2D risk. Methods: Studies were identifie...
Psoriasis prediction from genome-wide SNP profiles
BACKGROUND: With the availability of large-scale genome-wide association study (GWAS) data, choosing an optimal set of SNPs for disease susceptibility prediction is a challenging task. This study aimed to use single nucleotide polymorphisms (SNPs) selected by searching GWAS data to predict psoriasis. METHODS: In total, we had 2,798 samples and 451,724 SNPs. Process for searching a set of SNPs to predict...